pyiwn: A Python-based API to access Indian Language WordNets

نویسندگان

  • Ritesh Panjwani
  • Diptesh Kanojia
  • Pushpak Bhattacharyya
چکیده

Indian language WordNets have their individual web-based browsing interfaces along with a common interface for IndoWordNet. These interfaces prove to be useful for language learners and in an educational domain, however, they do not provide the functionality of connecting to them and browsing their data through a lucid application programming interface or an API. In this paper, we present our work on creating such an easy-to-use framework which is bundled with the data for Indian language WordNets and provides NLTK WordNet interface like core functionalities in Python. Additionally, we use a pre-built speech synthesis system forHindi language and augment Hindi data with audios for words, glosses, and example sentences. We provide a detailed usage of our API and explain the functions for ease of the user. Also, we package the IndoWordNet data along with the source code and provide it openly for the purpose of research. We aim to provide all our work as an open source framework for further development.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets

India is a country with 22 officially recognized languages and 17 of these have WordNets, a crucial resource. Web browser based interfaces are available for these WordNets, but are not suited for mobile devices which deters people from effectively using this resource. We present our initial work on developing mobile applications and browser extensions to access WordNets for Indian Languages. Ou...

متن کامل

A Practical Python API for Querying AFLOWLIB

Conrad W. Rosenbrock Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA. (Dated: October 3, 2017) Abstract Large databases such as aflowlib.org provide valuable data sources for discovering material trends through machine learning. Although a REST API and query language are available, there is a learning curve associated with the AFLUX language that acts as a ...

متن کامل

Brahmi-Net: A transliteration and script conversion system for languages of the Indian subcontinent

We present Brahmi-Net an online system for transliteration and script conversion for all major Indian language pairs (306 pairs). The system covers 13 Indo-Aryan languages, 4 Dravidian languages and English. For training the transliteration systems, we mined parallel transliteration corpora from parallel translation corpora using an unsupervised method and trained statistical transliteration sy...

متن کامل

The NLTK FrameNet API: Designing for Discoverability with a Rich Linguistic Resource

A new Python API, integrated within the NLTK suite, offers access to the FrameNet 1.7 lexical database. The lexicon (structured in terms of frames) as well as annotated sentences can be processed programatically, or browsed with human-readable displays via the interactive Python prompt.

متن کامل

IndoWordNet and its Linking with Ontology

Reasoning about natural language requires combining semantically rich lexical resources with world knowledge, provided by ontologies. In this paper, we describe linking of WordNets of Indian languages with an upper ontology SUMO (Suggested Upper Merged Ontology). This creates multilingual resource for Indian languages which can be used in various natural language processing applications. This p...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017